Universum Data


Granular Ball Twin Support Vector Machine with Universum Data

Ganaie, M. A., Ahire, Vrushank

arXiv.org Artificial Intelligence

Innovative Data Representation with Granular Balls: The GBU-TSVM model employs an innovative approach by representing data instances as granular balls rather than conventional points. This method improves the model's robustness and efficiency, especially in handling noisy and large datasets. By grouping data points into granular balls, the model achieves better computational efficiency, increased noise resistance, and enhanced interpretability, establishing a new standard in data representation.

Enhanced Generalization using Universum Data: The GBU-TSVM incorporates Universum data, which includes samples outside the target classes, to significantly improve generalization capabilities. Universum data enables the classifier to perform better on benchmark datasets, demonstrating the model's ability to utilize additional knowledge for more precise predictions.

Refined Learning with a Modified Hinge Loss Function: The model includes an advanced hinge loss function that accounts for the radii of granular balls, leading to a more accurate error measure and learning process. This modification allows for a detailed error assessment, enhancing the model's learning efficiency and decision boundary precision. By addressing the limitations of existing TSVM models, this innovation sets a new benchmark in the field of machine learning classifiers.
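The granular-ball idea above — covering each region of the training set with a ball described by a center, a radius, and a dominant class label — can be sketched with a simple purity-driven recursive split. This is a hypothetical simplification for illustration, not the paper's exact GBU-TSVM construction; the purity threshold, the stopping rule, and the two-means-style split are all assumptions.

```python
import numpy as np

def make_granular_balls(X, y, purity_threshold=0.9, min_points=4):
    """Recursively split a labeled point set into granular balls.

    A region is kept as a ball (center, radius, majority label) once the
    dominant class reaches `purity_threshold` or the region gets too small;
    otherwise it is split by proximity to the two largest class means.
    Illustrative sketch only, not the GBU-TSVM training procedure.
    """
    balls = []

    def split(Xs, ys):
        labels, counts = np.unique(ys, return_counts=True)
        purity = counts.max() / len(ys)
        if purity >= purity_threshold or len(ys) <= min_points or len(labels) == 1:
            center = Xs.mean(axis=0)
            radius = np.linalg.norm(Xs - center, axis=1).mean()
            balls.append((center, radius, labels[counts.argmax()]))
            return
        # split by distance to the means of the two most frequent classes
        top2 = labels[np.argsort(counts)[-2:]]
        c0 = Xs[ys == top2[0]].mean(axis=0)
        c1 = Xs[ys == top2[1]].mean(axis=0)
        near0 = np.linalg.norm(Xs - c0, axis=1) <= np.linalg.norm(Xs - c1, axis=1)
        if near0.all() or (~near0).all():  # degenerate split: stop here
            center = Xs.mean(axis=0)
            radius = np.linalg.norm(Xs - center, axis=1).mean()
            balls.append((center, radius, labels[counts.argmax()]))
            return
        split(Xs[near0], ys[near0])
        split(Xs[~near0], ys[~near0])

    split(np.asarray(X, float), np.asarray(y))
    return balls
```

On two well-separated clusters this yields one pure ball per cluster; the radii are what the modified hinge loss would then fold into the margin.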


Kernel-Free Universum Quadratic Surface Twin Support Vector Machines for Imbalanced Data

Moosaei, Hossein, Hladík, Milan, Mousavi, Ahmad, Gao, Zheming, Fu, Haojie

arXiv.org Artificial Intelligence

Binary classification tasks with imbalanced classes pose significant challenges in machine learning. Traditional classifiers often struggle to accurately capture the characteristics of the minority class, resulting in biased models with subpar predictive performance. In this paper, we introduce a novel approach to tackle this issue by leveraging Universum points to support the minority class within quadratic twin support vector machine models. Unlike traditional classifiers, our models utilize quadratic surfaces instead of hyperplanes for binary classification, providing greater flexibility in modeling complex decision boundaries. By incorporating Universum points, our approach enhances classification accuracy and generalization performance on imbalanced datasets. We generated four artificial datasets to demonstrate the flexibility of the proposed methods. Additionally, we validated the effectiveness of our approach through empirical evaluations on benchmark datasets, showing superior performance compared to conventional classifiers and existing methods for imbalanced classification.
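The "quadratic surfaces instead of hyperplanes" idea means the decision function is a quadric fitted directly in input space, with no kernel to choose. A minimal sketch of such a score function is below; the parameters `A`, `b`, `c` are placeholders for illustration, not the output of the paper's twin-SVM training with Universum points.

```python
import numpy as np

def quadratic_surface_score(X, A, b, c):
    """Evaluate f(x) = 1/2 x^T A x + b^T x + c for each row of X.

    In kernel-free quadratic-surface SVMs, the separating surface
    {x : f(x) = 0} is a quadric learned in the original input space,
    giving curved decision boundaries without any kernel selection.
    """
    X = np.asarray(X, float)
    quad = 0.5 * np.einsum('ni,ij,nj->n', X, A, X)  # 1/2 x^T A x per row
    return quad + X @ b + c
```

For example, `A = 2*I`, `b = 0`, `c = -1` gives the unit circle as the decision boundary: points inside score negative, points outside positive.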


Intuitionistic Fuzzy Universum Twin Support Vector Machine for Imbalanced Data

Quadir, A., Tanveer, M.

arXiv.org Artificial Intelligence

One of the major difficulties in machine learning is categorizing imbalanced datasets. This problem may lead to biased models in which the training process is dominated by the majority class, resulting in inadequate representation of the minority class. The Universum twin support vector machine (UTSVM) produces a model biased towards the majority class; as a result, its performance on the minority class is often poor, as minority samples may be mistakenly classified as noise. Moreover, UTSVM is not proficient in handling datasets that contain outliers and noise. Inspired by the concept of incorporating prior information about the data and employing an intuitionistic fuzzy membership scheme, we propose the intuitionistic fuzzy Universum twin support vector machine for imbalanced data (IFUTSVM-ID). We use an intuitionistic fuzzy membership scheme to mitigate the impact of noise and outliers, and to tackle the problem of imbalanced class distribution we employ data oversampling and undersampling. Prior knowledge about the data is provided by universum data, which leads to better generalization performance. UTSVM is susceptible to overfitting because the structural risk minimization (SRM) principle is omitted from its primal formulation; the proposed IFUTSVM-ID model incorporates the SRM principle through regularization terms, effectively addressing this issue. We conduct a comprehensive evaluation of the proposed IFUTSVM-ID model on benchmark datasets from KEEL and compare it with existing baseline models. Furthermore, to assess its effectiveness in diagnosing Alzheimer's disease (AD), we apply it to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Experimental results showcase the superiority of the proposed IFUTSVM-ID model over the baseline models.
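An intuitionistic fuzzy membership scheme of the kind described assigns each training point a membership degree (how typical it is of its class) and a non-membership degree (how much it resembles the other class), then combines them into a weight that down-weights likely noise and outliers. The sketch below uses one generic construction — distance to the class centre for membership, fraction of opposite-class nearest neighbours for non-membership — which is an assumption, not the exact IFUTSVM-ID formulation.

```python
import numpy as np

def intuitionistic_fuzzy_scores(X, y, k=5, delta=1e-6):
    """Per-sample membership mu, non-membership nu, and combined weight.

    mu decreases with distance from the sample's class centre; nu is the
    fraction of the k nearest neighbours with a different label. Points
    whose nu dominates are treated as noise and weighted to zero.
    Generic scheme for illustration, not the paper's exact definitions.
    """
    X, y = np.asarray(X, float), np.asarray(y)
    n = len(y)
    mu = np.empty(n)
    for cls in np.unique(y):
        idx = y == cls
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        mu[idx] = 1.0 - d / (d.max() + delta)
    # pairwise distances; exclude self before taking k nearest neighbours
    D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(D, np.inf)
    nn = np.argsort(D, axis=1)[:, :k]
    nu = (y[nn] != y[:, None]).mean(axis=1)
    weight = np.where(nu == 0, mu,
                      np.where(mu <= nu, 0.0, (1.0 - nu) / (2.0 - mu - nu)))
    return mu, nu, weight
```

These weights would multiply each sample's slack penalty in the twin-SVM objective, so clean interior points count fully while boundary noise counts little or not at all.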


Universum-inspired Supervised Contrastive Learning

Han, Aiyang, Geng, Chuanxing, Chen, Songcan

arXiv.org Artificial Intelligence

As an effective data augmentation method, Mixup synthesizes additional samples through linear interpolation. Despite its theoretical dependency on data properties, Mixup performs well as a regularizer and calibrator, contributing reliable robustness and generalization to deep model training. In this paper, inspired by Universum learning, which uses out-of-class samples to assist the target task, we investigate Mixup from a largely under-explored perspective: its potential to generate in-domain samples that belong to none of the target classes, that is, universum. We find that in the framework of supervised contrastive learning, Mixup-induced universum can serve as surprisingly high-quality hard negatives, greatly relieving the need for large batch sizes in contrastive learning. With these findings, we propose Universum-inspired supervised Contrastive learning (UniCon), which incorporates the Mixup strategy to generate Mixup-induced universum as universum negatives and pushes them apart from anchor samples of the target classes. We extend our method to the unsupervised setting, proposing the Unsupervised Universum-inspired contrastive model (Un-Uni). Our approach not only improves Mixup with hard labels but also introduces a novel way to generate universum data. With a linear classifier on the learned representations, UniCon shows state-of-the-art performance on various datasets. Specifically, UniCon achieves 81.7% top-1 accuracy on CIFAR-100, surpassing the state of the art by a significant margin of 5.2% with a much smaller batch size (256 in UniCon vs. 1024 in SupCon using ResNet-50). Un-Uni also outperforms SOTA methods on CIFAR-100. The code of this paper is released at https://github.com/hannaiiyanggit/UniCon.
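The core generation step — interpolating pairs of samples from different classes so the result belongs to neither — can be sketched in a few lines. This is a minimal illustration on raw feature vectors under assumed defaults (a fixed mid-range lambda, random pairing); UniCon itself operates on images inside a supervised contrastive pipeline.

```python
import numpy as np

def mixup_universum(X, y, lam=0.5, rng=None):
    """Generate Mixup-induced universum samples from cross-class pairs.

    Each sample is paired with a random partner; only pairs with different
    labels are mixed, since lam * x_i + (1 - lam) * x_j with y_i != y_j is
    in-domain yet belongs to neither class -- a hard negative candidate.
    Sketch only; not the UniCon training pipeline.
    """
    rng = np.random.default_rng(rng)
    X, y = np.asarray(X, float), np.asarray(y)
    perm = rng.permutation(len(y))
    keep = y != y[perm]  # same-class pairs would just be augmentations
    return lam * X[keep] + (1 - lam) * X[perm][keep]
```

In a contrastive loss these mixed points would be appended to the negative set for every anchor, which is what lets UniCon shrink the batch size.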


An Analysis of Inference with the Universum

Chapelle, Olivier, Agarwal, Alekh, Sinz, Fabian H., Schölkopf, Bernhard

Neural Information Processing Systems

We study a pattern classification algorithm which has recently been proposed by Vapnik and coworkers. It builds on a new inductive principle which assumes that in addition to positive and negative data, a third class of data is available, termed the Universum. We assay the behavior of the algorithm by establishing links with Fisher discriminant analysis and oriented PCA, as well as with an SVM in a projected subspace (or, equivalently, with a data-dependent reduced kernel). We also provide experimental results.
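The inductive principle analyzed here penalizes universum points that stray far from the decision boundary: labeled points incur the usual hinge loss, while universum points incur an epsilon-insensitive loss on the magnitude of the decision value, encoding that they belong to neither class. Below is a sketch of that primal objective for a linear classifier; the specific parameter names and defaults are illustrative, not taken from the paper.

```python
import numpy as np

def usvm_objective(w, b, X, y, X_u, C=1.0, C_u=1.0, eps=0.1):
    """Primal objective of a Vapnik-style SVM with Universum data (sketch).

    Labeled data (X, y in {-1, +1}) pay hinge loss max(0, 1 - y f(x));
    universum points X_u pay max(0, |f(x)| - eps), pulling the boundary
    so that universum examples sit near f(x) = 0.
    """
    f = X @ w + b
    hinge = np.maximum(0.0, 1.0 - y * f).sum()
    f_u = X_u @ w + b
    uloss = np.maximum(0.0, np.abs(f_u) - eps).sum()
    return 0.5 * w @ w + C * hinge + C_u * uloss
```

The paper's links to Fisher discriminant analysis and oriented PCA come from analyzing which directions this universum term suppresses.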

